DESCRIPTION



AUTOMATIC KEYWORD EXTRACTION APPARATUS,
METHOD, RECORDING MEDIUM AND PROGRAM



TECHNICAL FIELD 



The present invention relates to an apparatus, a method, a
recording medium and a program which extract a keyword
automatically from title character string information and
detailed character string information of contents such as EPG
(Electronic Program Guide) information.

BACKGROUND ART

In digital television broadcasting, which has come into full
swing in recent years, EPG information is transmitted from the
broadcast station together with the video and audio data of a
program. The EPG information includes information designating a
program title (title character string information), information
explaining details of the program (detailed character string
information), information designating a genre of the program and
the like. A television receiver designed for digital broadcasting
can display an electronic program guide on a screen according to
the EPG information.

Further, there are also analog television broadcasts in
which such EPG information is transmitted.

When a user searches for a program he wants to watch,
he uses this electronic program guide to search by
title, or to search by reading the detailed character string
information or the like after selecting a rough genre (for
example, sport, drama or the like).

However, programs are titled in endlessly varied ways, so
that it is not always easy for a user to search by title. Also,
the detailed character string information of a program is written
in sentence form and often runs to several pages, so that
searching through the detailed character string information is
troublesome for a user.

On the other hand, searching becomes very easy for a user
if a program search can be made, for example, by using a keyword
such as the name of a professional entertainer. However, at
present no keyword is included as an independent item in the EPG
information transmitted from the broadcast station. Therefore,
a keyword must be extracted from the EPG information in order to
make a keyword search possible.

Heretofore, there has existed a keyword extraction method
in which a user designates, by means of a cursor or the like, the
head and tail end words of a character string which he wishes to
register as a keyword within a sentence of the detailed character
string information in an electronic program guide displayed on a
television receiver.

However, according to this conventional extraction method,
the user himself must perform an operation for designating each
keyword, so that the method is cumbersome and, at the same time,
it is difficult to extract a large number of keywords in a short
period of time.

On the other hand, a method called Japanese language
morphological analysis is known as a general automatic keyword
extraction method. However, this method requires a very large
program and/or dictionary and, at the same time, places a heavy
load on the CPU. Consequently, it is extremely inefficient to
use this method in home electric appliances such as a television
receiver in which the throughput capacity or the memory capacity
of the CPU is not so large.

Further, a method called a character type separation method
is also known as a general automatic keyword extraction method.
According to this method, a keyword is extracted by detecting
differences in character type among Chinese characters, Katakana,
Hiragana, alphabetic characters, numerical characters and the
like. However, a keyword for searching a program cannot be
extracted accurately by this character type separation method
alone. More specifically, with respect to a name of a
professional entertainer which has Chinese characters for the
family name and Hiragana or Katakana for the first name (for
example, such a name as "akarl ISHIDA"), it is not possible to
extract the whole name, because the family name and the first
name are separated. Further, it is not possible either to extract
a foreigner's name whose first name is written in alphabetic
characters and whose family name is written in Katakana, or a
foreigner's name in which "・" (midpoint) is inserted between the
first and family names (for example, such a name as "B ・
DooJey"), because the family name and the first name are
separated.

In view of the foregoing, the problem addressed by the
present invention is to make it possible to extract a keyword for
searching contents automatically, efficiently and moreover
accurately from the title character string information and
detailed character string information of contents such as EPG
information, even in home electric appliances in which the
throughput capacity or the memory capacity of the CPU is not so
large.

DISCLOSURE OF THE INVENTION

In order to solve this problem, the present applicant
proposes an automatic keyword extraction apparatus which
comprises first extraction means for extracting a keyword from
title character string information of contents by using a first
keyword dictionary in which character strings designating
sub-genres are registered; and second extraction means for
extracting a keyword from detailed character string information
of the contents by using a second keyword dictionary in which
names of persons are registered and for extracting a keyword by
utilizing a character type separation method.

In this automatic keyword extraction apparatus, a keyword
is extracted from the title character string information of
contents (for example, title character string information in EPG
information in the case of a television broadcast) by using the
first keyword dictionary in which character strings designating
sub-genres are registered.

Also, a keyword is extracted from the detailed character
string information of the contents (for example, detailed
character string information in EPG information in the case of a
television broadcast) by using the second keyword dictionary in
which names of persons are registered and, at the same time,
keyword extraction is also performed by utilizing the character
type separation method. In this regard, a name of a person which
has Chinese characters for the family name and Hiragana or
Katakana for the first name is also extracted as a keyword if it
is a name of a person registered in the second keyword
dictionary. Also, even a name of a person which is not
registered in the second keyword dictionary is extracted as a
keyword by utilizing the character type separation method.

As described above, a keyword can be extracted accurately
using a small program and small dictionaries by carrying out the
keyword extraction from the title character string information
and the keyword extraction from the detailed character string
information with keyword dictionaries that differ from each
other and with rules (whether or not the character type
separation method is utilized, or the like) suited to the
respective information.

In this manner, it becomes possible to extract a keyword
for searching the contents automatically, efficiently and
moreover accurately from the title character string information
and detailed character string information of the contents such
as EPG information, even in home electric appliances in which
the throughput capacity or the memory capacity of the CPU is not
so large.

It should be noted that, in this automatic keyword
extraction apparatus, it is preferable, as one example, for the
first extraction means to extract a keyword from the portion of
a title character string including a character string registered
in the first keyword dictionary that remains after excluding any
character string registered in a predetermined character string
dictionary for exclusion.

In this manner, it becomes possible to extract a keyword
for searching contents from the title character string
information and detailed character string information of
contents such as EPG information automatically, efficiently and
moreover accurately, even in home electric appliances in which
the throughput capacity or the memory capacity of the CPU is not
so large.



Further, the first extraction means, as one example, in this
automatic keyword extraction apparatus preferably extracts, as a
keyword, a character string separated by a special character
other than Hiragana, Katakana, Chinese characters, numerical
characters and alphabetic characters within a title character
string including a character string registered in the first
keyword dictionary.

In this manner, a title which is not separated by such a
special character is extracted as a keyword as it is, and the
plurality of character strings included in the title are
prevented from being extracted as separate keywords.

For a title which is not separated by such a special
character, the individual character strings included in the
title are not very useful as keywords for searching contents,
because each has an extremely broad meaning (the search returns
an extremely large number of results), and it is often only the
title as a whole that is useful as a keyword for an efficient
search of contents. Consequently, it becomes possible for a user
to search contents still more effectively by using the extracted
keyword (the title itself).

On the other hand, with respect to a title separated by a
special character, each individual character string separated by
the special character is extracted as a keyword.

With respect to a title separated by a special character
(for example, a space, "×" or the like), each individual
character string separated by the special character is useful
as a keyword for searching contents, and it is often the case
that the title itself is too restrictive to be useful as a
keyword for searching contents (the search returns zero or very
few results). Consequently, it also becomes possible for a user
to search contents still more effectively by using the extracted
keywords (the individual character strings separated by a
special character).

Further, the second extraction means, as one example, in this
automatic keyword extraction apparatus preferably extracts a
keyword by utilizing the character type separation method from
the portion of the detailed character string information that
remains after the keywords have been extracted by using the
second keyword dictionary, excluding any character string
registered in a predetermined character string dictionary for
exclusion.

In this manner, it is possible to prevent a character
string which is included in the detailed character string
information but is inappropriate for searching contents from
being included in the keywords. Consequently, it becomes
possible for a user to search contents still more effectively by
using the extracted keywords.

Further, the second extraction means, as one example, in this
automatic keyword extraction apparatus preferably treats
Katakana and alphabetic characters as the same character type
while utilizing the character type separation method and, at the
same time, treats "・" (midpoint) as Katakana or as an alphabetic
character when the character just before it is Katakana or an
alphabetic character, respectively.

In this manner, it becomes possible to extract, as a
keyword, a foreigner's name in which the first name is written
in alphabetic characters and the family name is written in
Katakana, or a foreigner's name in which "・" (midpoint) is
inserted between the first name and family name.

Further, it is preferable that this automatic keyword
extraction apparatus further comprises means for downloading the
second keyword dictionary via a network, wherein the second
extraction means uses the downloaded second keyword dictionary.

In this manner, it becomes possible to extract a keyword by
using the newest dictionary (a dictionary in which the name of a
person who became famous just recently is also registered) as
the second keyword dictionary.

Next, the present applicant proposes an automatic keyword
extraction method which comprises a first step of extracting a
keyword from title character string information of contents by
using a first keyword dictionary in which character strings
designating sub-genres are registered; and a second step of
extracting a keyword from detailed character string information
of the contents by using a second keyword dictionary in which
names of persons are registered and of extracting a keyword by
utilizing a character type separation method.

Also, a computer-readable recording medium on which a
program for an automatic keyword extraction apparatus is
recorded is proposed, wherein the program comprises a first
extracting step of extracting a keyword from title character
string information of contents by using a first keyword
dictionary in which character strings designating sub-genres are
registered; and a second extracting step of extracting a keyword
from detailed character string information of the contents by
using a second keyword dictionary in which names of persons are
registered and of extracting a keyword by utilizing a character
type separation method.

Also, a program is proposed which makes a computer that
controls an automatic keyword extraction apparatus execute a
first extracting step of extracting a keyword from title
character string information of contents by using a first
keyword dictionary in which character strings designating
sub-genres are registered; and a second extracting step of
extracting a keyword from detailed character string information
of the contents by using a second keyword dictionary in which
names of persons are registered and of extracting a keyword by
utilizing a character type separation method.

According to the automatic keyword extraction method, the
recording medium or the program, it becomes possible to extract
a keyword for searching contents automatically, efficiently and
moreover accurately from title character string information and
detailed character string information of contents such as EPG
information, even in home electric appliances in which the
throughput capacity or the memory capacity of the CPU is not so
large, just as explained for the automatic keyword extraction
apparatus of the present invention described above.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a diagram showing an outline of a digital
television broadcast receiving system which includes a program 
recording and reproducing apparatus applied with the present 
invention; 

FIG. 2 is a block diagram showing a hardware constitution 
of the program recording and reproducing apparatus of FIG. 1; 

FIG. 3 is a flowchart showing an automatic keyword
extraction process executed by a CPU in FIG. 2; 

FIG. 4 is a flowchart showing an automatic keyword 
extraction process executed by a CPU in FIG. 2; 

FIG. 5 is a diagram showing a rule for extracting a keyword
in the process of FIG. 3;

FIG. 6 is a diagram showing a rule for extracting a keyword
in the process of FIG. 4; and

FIG. 7 is a block diagram showing a hardware constitution of a
program recording and reproducing apparatus for an analog
television broadcast to which the present invention is applied.

BEST MODE FOR PERFORMING THE INVENTION

Hereinafter, an example in which the present invention is
applied to an apparatus which records and reproduces programs of
a digital television broadcast will be explained with reference
to the drawings.

FIG. 1 is a diagram showing an outline of a digital
television broadcast receiving system which includes a program
recording and reproducing apparatus to which the present
invention is applied. A digital transmission signal transmitted
from a television broadcast station is received by an antenna 1
and inputted to a program recording and reproducing apparatus 2.
The program recording and reproducing apparatus 2 is connected
to a display apparatus 3 which includes a display and a speaker
and is also connected to the Internet 4.

FIG. 2 is a block diagram showing a hardware constitution
of the program recording and reproducing apparatus 2. In the
program recording and reproducing apparatus 2, a tuner 11, a
demodulator 12, a descrambler 13 and a multiplex separator 14
are serially connected in this order. In addition, a video
decoder 15 and a video signal processing circuit 17 are
connected to the multiplex separator 14 in this order, and an
audio decoder 16 and a D/A converter 18 are likewise connected
to it in this order.

Also, the components from the tuner 11 to the D/A converter
18, a CPU 19, a ROM 20, a main memory (RAM) 21, a flash memory
22, an interface 23 for a remote controller, an interface 24 for
an HDD (hard disk drive) and a communication interface 25 for an
Internet connection are connected to one another by means of a
system bus 26. An HDD 27 for picture-recording television
programs is connected to the interface 24.

A remote controller (hereinafter referred to as REMOCON)
28 attached to the program recording and reproducing apparatus 2
is provided with the same kinds of operation buttons (a power
supply button, a channel selection button, a picture-record
reservation button, a playback button, a direction key for
performing a selection on an EPG on the screen, a determination
key and the like) as those of a REMOCON attached to an ordinary
television receiver for digital broadcasts.

When a television program is viewed, the digital
transmission signal inputted to the program recording and
reproducing apparatus 2 is selected in frequency band by the
tuner 11 according to a channel selection operation of the
REMOCON 28, is then demodulated in the demodulator 12 and
descrambled in the descrambler 13, and is thereafter separated
in the multiplex separator 14 into video and audio data packets
of the programs for multiple channels, EPG information packets
and the like.

The video and audio data packets for one channel, extracted
from the video and audio packets of the television programs for
multiple channels according to a channel selection operation of
the REMOCON 28, are decoded as MPEG-2 Video and MPEG-2 Audio by
the video decoder 15 and the audio decoder 16, respectively.
Also, the EPG information packets are transmitted to the CPU 19.

Then, the video signal decoded in the video decoder 15 and
a video signal for displaying the electronic program guide,
produced in the CPU 19 by using the EPG information, are
subjected to conversion to the NTSC system, mixing and the like
in the video signal processing circuit 17 and are outputted from
a video output terminal 29 so as to be transmitted to the
display apparatus 3 of FIG. 1.

Also, the audio signal decoded in the audio decoder 16 is 
analog-converted in the D/A converter 18 and outputted from an 
audio output terminal 30 so as to be transmitted to the display 
apparatus 3 of FIG. 1. 

The CPU 19 controls the whole program recording and 
reproducing apparatus 2 according to programs and data stored in 
the ROM 20 by using the main memory 21 as a working memory. 

The processes performed by the CPU 19 include, in addition
to the process for viewing a television program according to a
channel selection operation of the REMOCON 28 and the
picture-recording process of a television program to the HDD 27
according to a picture-record reservation operation of the
REMOCON 28, an automatic keyword extraction process.

In the ROM 20, a keyword dictionary for titles, an excluded 
character string dictionary for titles, a keyword dictionary for 
detailed information and an excluded character string dictionary 
for detailed information are stored as dictionaries to be used 
for the automatic keyword extraction process.

In the keyword dictionary for titles, there are registered
character strings showing sub-genres (genres more detailed than
a rough genre such as "sport" given by the genre information in
the EPG information) such as "professional baseball", "golf",
"soccer", "hot spring", "GO", "Japanese chess", "movie" or the
like; character strings such as "passionate love" and "love";
and other character strings which are effective and important
for searching for a program among the character strings which
are often included in program titles, such as the names of
professional baseball clubs.

In the excluded character string dictionary for titles,
there are registered character strings which are included in
program titles but are too general as keywords for searching
programs, such as "movie", "BS" and symbols unique to the
program table (for example, a symbol of N surrounded by a square
frame for designating a news program).

In the keyword dictionary for detailed information, there
are registered, among the names of famous people (professional
entertainers, athletes, politicians, cultural figures or the
like) who often appear in television programs, the character
strings of names written in Hiragana only, in a combination of
Hiragana and Chinese characters, in a combination of Hiragana
and Katakana, in a combination of Chinese characters and
Katakana, in Chinese characters only with 2 or fewer characters,
and in Chinese characters only with 6 or more characters. Also,
in the keyword dictionary for detailed information, there are
also registered character strings other than names of persons
which are suitable as keywords for searching programs among the
character strings which are often included in detailed character
string information in EPG information, such as, for example,
"hot spring".

In the excluded character string dictionary for detailed
information, there are registered character strings
inappropriate as keywords for searching programs among the
character strings which are often included in detailed character
string information in EPG information, such as "guest", "ended"
and "manager".

It should be noted that, with respect to the keyword
dictionary for detailed information, the CPU 19 downloads the
newest version (one in which names of persons who became famous
just recently or the like are registered) from a dedicated site
via the Internet and stores it in the flash memory 22 as well.

Also, the CPU 19 stores in the flash memory 22 the EPG
information packets which are transmitted from the multiplex
separator 14 upon a channel selection operation by a user or
upon picture recording according to a picture-record reservation
operation by a user, so that the automatic keyword extraction
process can be carried out.

FIGS. 3 and 4 are flowcharts respectively showing the
automatic keyword extraction processes executed by the CPU 19.
FIG. 3 shows a process for extracting a keyword from title
character string information, wherein title character string
information is first picked up from the EPG information stored
in the flash memory 22 (step S1).

Subsequently, character strings registered in the keyword
dictionary for titles (character strings showing sub-genres such
as "golf", "soccer", "hot spring", "GO", "Japanese chess",
"movie" or the like) are searched for in the plurality of
program titles which the title character string information
shows. Then, among these program titles, the whole title
character string of each title which includes a character string
registered in the keyword dictionary for titles is made a
keyword extraction target (step S2).

Subsequently, the portions of character strings ("movie",
"BS" or the like) which are registered in the excluded character
string dictionary for titles within the titles which were made
keyword extraction targets in step S2 are replaced by
spaces (step S3).

Subsequently, a keyword is extracted from the title
character strings which went through step S3 according to an
extraction rule for titles such as shown in FIG. 5 (step S4).

According to the extraction rule for titles, the title
character string is extracted as a keyword as it is when the
title character string is not separated by a special character
(space, "×", "「 」" or the like) other than Hiragana, Katakana,
Chinese characters, numerical characters and alphabetic
characters. On the other hand, when the title character string
is separated by such a special character, each of the character
strings separated by the special character which has 2 or more
characters is extracted as a keyword.

However, "・" (midpoint) is not treated as a special
character. When a "・" (midpoint) exists at the head or at the
tail end of a character string extracted as a keyword, the
portion excluding the "・" (midpoint) is made the keyword.
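
The extraction rule for titles described above might be sketched
in Python as follows. The Unicode ranges used to recognize
Hiragana, Katakana and Chinese characters, the handling of ASCII
letters and digits only, the function names and the sample titles
(including the team names 東軍 and 西軍) are all assumptions made
for illustration; FIG. 5 itself is not reproduced here.

```python
MIDPOINT = "\u30fb"  # katakana middle dot, not treated as a special character


def is_ordinary(ch: str) -> bool:
    """True for Hiragana, Katakana, Chinese characters, digits, letters, midpoint."""
    code = ord(ch)
    return (
        ch == MIDPOINT
        or 0x3041 <= code <= 0x309F      # Hiragana
        or 0x30A1 <= code <= 0x30FA      # Katakana letters
        or code == 0x30FC                # prolonged sound mark
        or 0x4E00 <= code <= 0x9FFF      # Chinese characters
        or (ch.isascii() and ch.isalnum())
    )


def extract_title_keywords(title: str) -> list[str]:
    """Step S4: split on special characters and apply the length rule."""
    parts, current = [], ""
    for ch in title:
        if is_ordinary(ch):
            current += ch
        else:                 # a special character ends the current part
            parts.append(current)
            current = ""
    parts.append(current)
    # Drop a midpoint left at the head or tail of a part.
    parts = [p.strip(MIDPOINT) for p in parts]
    if len(parts) == 1:       # no special character: the title is the keyword
        return [parts[0]] if parts[0] else []
    return [p for p in parts if len(p) >= 2]
```

For example, `extract_title_keywords("プロ野球中継 東軍\u00d7西軍")`
yields the three parts separated by the space and by "×", while a
title without special characters is returned whole.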

Finally, the keywords extracted in step S4 are stored in
the flash memory 22 as a keyword list for the title character
string information (step S5).

Next, FIG. 4 shows a process for extracting a keyword from
detailed character string information. First, detailed
character string information is picked up from the EPG
information stored in the flash memory 22 (step S11).

Subsequently, character strings (famous persons' names or
the like) registered in the keyword dictionary for detailed
information are searched for in that detailed character string
information. Then, each character string registered in the
keyword dictionary for detailed information which is found in
the detailed character string information is extracted as a
keyword and, at the same time, the portion of that character
string is replaced by a half-size space (step S12).
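
Step S12 can be sketched in Python as follows. The function
name, the sample dictionary entries and the sample sentence are
hypothetical, and matching longer entries first is an assumption
added so that a full name is not pre-empted by a shorter entry
contained in it.

```python
def extract_registered(text, dictionary):
    """Step S12: collect registered strings and blank them out of the text."""
    keywords = []
    # Longest entries first (an assumption, not stated in the description).
    for entry in sorted(dictionary, key=len, reverse=True):
        if entry in text:
            keywords.append(entry)
            text = text.replace(entry, " ")  # half-size (ASCII) space
    return keywords, text


# Hypothetical dictionary: a name with kanji family name and hiragana
# first name, plus a non-name entry such as "hot spring" (温泉).
names = ["山田はなこ", "温泉"]
found, rest = extract_registered("山田はなこが温泉を訪ねる", names)
print(found)
print(repr(rest))
```

The remaining text, with the extracted portions replaced by
spaces, is what the later steps operate on.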

Subsequently, the portions of character strings ("guest",
"ended", "manager" or the like) which are registered in the
excluded character string dictionary for detailed information
and are within the character strings of the detailed character
string information which went through step S12 are replaced
by half-size spaces (step S13).

Subsequently, a keyword is extracted from the character
strings of the detailed character string information which went
through step S13 according to an extraction rule for the detailed
character string information such as shown in FIG. 6 (step S14).

According to the extraction rule for the detailed character
string information, a character type separation method which
separates Hiragana, Katakana, Chinese characters, numerical
characters, alphabetic characters and characters of other types
from one another is basically utilized.

However, Katakana and alphabetic characters are treated as
the same character type (not separated). Also, "・" (midpoint)
is treated as Katakana or as an alphabetic character when the
character just before it is Katakana or an alphabetic character,
respectively (not separated).

Then, among the separated character strings, those other
than character strings composed of Hiragana only, character
strings composed of Chinese characters only with 2 or fewer
characters and character strings composed of Chinese characters
only with 6 or more characters are each extracted as keywords.
However, when "・" (midpoint) exists at the head or at the tail
end of a character string extracted as a keyword, the portion
excluding the "・" (midpoint) is made the keyword.

Finally, the keywords extracted in step S12 and the keywords
extracted in step S14 are stored in the flash memory 22 as a
list of keywords for the detailed character string information
(step S15).

Next, the manner in which keywords for a program search are
extracted in the program recording and reproducing apparatus 2
will be explained by citing an example.

It is assumed that titles such as, for example, shown next 
are included within the title character string information in 
the EPG information which is transmitted from the multiplex 
separator 14 in case of a channel selection operation of a user 
or in case of a picture recording according to a picture-record 
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reservation operation of a user and is stored in the flash 
memory 22 (here, □ □ and A A designate professional baseball 

team-names) . 

Much ado about nothing of love 

professional baseball relay broadcast □□ X AA 
BS movie Tspace-warsJ 

Then, since the character strings "love", "professional
baseball" and "movie" are registered in the keyword dictionary
for titles, the whole character string of each of these titles
becomes a keyword extraction target in step S2 in the process of
FIG. 3.

Then, with respect to the BS movie Fspace-warsJ within these 
titles, the portion ^^BS" and the portion ''movie" are substituted 
by spaces in step S3. 

Also, with respect to "professional baseball relay
broadcast □□×△△" among these titles, a space (special
character) exists between "professional baseball relay broadcast"
and "□□", and "×" (special character) also exists between "□□"
and "△△", so that the character strings "professional baseball
relay broadcast", "□□" and "△△" are each extracted as keywords
in step S4.

Also, with respect to Tspace-warsJ within these titles where 
portions of "BS" and "movie" are substituted by spaces, it is 
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separated by Tj (particular symbol) and • " (midpoint) is not 
handled as a particular symbol either^ so that ^^space- wars" which 
is the title itself of the original movie is extracted as a 
keyword in step S4- 

Also, "Much ado about nothing of love" among these titles
is not separated by a special character, so that "Much ado about
nothing of love", which is the title itself, is extracted as a
keyword in step S4.

Consequently, character strings shown in below are stored 
in the flash memory 22 in step S5 as keywords for the program 
search (as mentioned above, □ □ and A A are professional 

baseball team-names) . 

Much ado about nothing of love 
Professional baseball relay broadcast 
□ □ 
AA 

Space*wars 

In this manner, with respect to titles such as ^'Much ado 
about nothing of love" and '^space • wars" which are not separated 
by special characters, the titles themselves are extracted 
according the process of FIG. 3 as keywords in a form as they 
are with a situation where a plurality of character strings 
included in the titles are not extracted as random keywords. 

For a title which is not separated by such a special
character, the individual character strings such as "love" and
"space" included in the title are not very useful as keywords
for a program search, because each has an extremely broad
meaning (the search returns very many results), and it is often
only the title as a whole that is useful as a keyword for an
efficient program search. Consequently, it becomes possible for
a user to search for a program efficiently by using the
extracted keyword (the title itself).

In addition, for the title character string of the movie
"space•wars", character strings such as "BS" and "movie", which
were added to the title in the title character string
information and are too general for a program search, are not
included in the keyword, and 「」, which surrounded the title in
the title character string information, is not included in the
keyword either. Consequently, it becomes possible for a user to
search for a program efficiently.

On the other hand, for titles such as "professional
baseball relay broadcast □□×AA", which are separated by special
characters (space, "×", etc.), "professional baseball relay
broadcast", "□□" and "AA", which are the individual character
strings separated by the special characters, are each extracted
as keywords according to the process of FIG. 3.

For a title which is separated by such special characters,
the individual character strings separated by the special
characters are each useful as keywords for the program search,
and it is often the case that the title itself is not so useful
as a keyword for the program search because it is too
restrictive (the search returns no results or very few, because
the title changes whenever the opposing team (a concrete name
such as "□□" or "AA") is different). Consequently, it also
becomes possible for a user to search for a program efficiently
by using the extracted keywords (the individual character
strings separated by special characters).

On the other hand, according to the process of FIG. 4, names
or the like of famous people (the host of the program "Much ado
about nothing of love" and a guest thereof, and an actor who
appears in the movie "space•wars") which are registered in the
keyword dictionary for detailed information are extracted as
keywords in step S12 from the detailed character string
information of the programs having these titles in the EPG
information stored in the flash memory 22.

In this regard, names of famous people who have Chinese
characters for their family names and Hiragana or Katakana for
their first names (for example, name of akari ISHIDA) are also
registered in the keyword dictionary for detailed information,
so that the names of such famous people are also extracted as
keywords.

Also, the name of a person who became famous only recently
is also extracted as a keyword, because the newest keyword
dictionary for detailed information, downloaded via the
internet, is also used.

Also, within the character strings of the detailed character
string information, the portions of the names or the like of the
famous people and the portions of character strings ("guest",
"ended", "manager" or the like) which are registered in an
excluded character string dictionary for detailed information
are replaced with half-size spaces in steps S12 and S13.

Then, in step S14, keywords are extracted, according to the
extraction rule shown in FIG. 6, from the character strings of
the detailed character string information in which these
portions have been replaced with spaces.
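
The dictionary lookup and space substitution of steps S12 and S13 might be sketched as below. This is a simplified illustration under assumptions: the function name, the sample dictionary entries and the longest-entry-first ordering are hypothetical and are not taken from the specification.

```python
def extract_and_mask(detail: str,
                     keyword_dict: list[str],
                     excluded: list[str]) -> tuple[list[str], str]:
    """Return (keywords found in the dictionary, masked detail string)."""
    found = []
    # Step S12: take out registered names and blank them with a
    # half-size space. Longer entries are tried first (an assumption) so
    # that a short entry does not split a longer registered name.
    for name in sorted(keyword_dict, key=len, reverse=True):
        if name in detail:
            found.append(name)
            detail = detail.replace(name, " ")
    # Step S13: blank out the excluded character strings in the same way.
    for word in excluded:
        detail = detail.replace(word, " ")
    return found, detail


if __name__ == "__main__":
    # Hypothetical dictionary contents and detail string.
    names = ["Taro", "TaroYamada"]
    stop = ["guest", "ended", "manager"]
    print(extract_and_mask("guestTaroYamada,managerXyz", names, stop))
```

The masked string returned here is what a later rule-based pass (step S14) would operate on, so neither the already extracted names nor the excluded strings can reappear as keywords.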

In this regard, Katakana and alphabet are handled as the
same character type and, at the same time, "•" (midpoint) is
handled as Katakana or alphabet when the letter just before it
is Katakana or alphabet respectively, so that a foreigner's name
in which "•" (midpoint) is inserted between the first and family
names (for example, "B•Dooley") can also be extracted as a
keyword.

Also, the name of a person which is not yet registered in
the newest keyword dictionary for detailed information (for
example, an obscure entertainer who has just debuted) can be
extracted as a keyword unless it is a name composed only of
Hiragana, a name composed only of two or fewer Chinese
characters, or a name composed only of six or more Chinese
characters (more specifically, unless it is a character string
which is unlikely to be the name of a person).
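
The three exclusion conditions stated above can be written as a small predicate. This is a minimal sketch: the function names, the Unicode ranges used to recognise Hiragana and Chinese characters, and the sample strings are assumptions for illustration only.

```python
def is_hiragana(ch: str) -> bool:
    return "\u3041" <= ch <= "\u309f"


def is_kanji(ch: str) -> bool:
    # CJK Unified Ideographs block only (an assumption of this sketch).
    return "\u4e00" <= ch <= "\u9fff"


def plausible_person_name(candidate: str) -> bool:
    """Apply the three exclusion conditions to an unregistered candidate."""
    if not candidate:
        return False
    # Excluded: composed only of Hiragana.
    if all(is_hiragana(ch) for ch in candidate):
        return False
    # Excluded: composed only of Chinese characters and either two or
    # fewer, or six or more, characters long.
    if all(is_kanji(ch) for ch in candidate):
        if len(candidate) <= 2 or len(candidate) >= 6:
            return False
    return True


if __name__ == "__main__":
    # Hypothetical candidates.
    for s in ("山田太郎", "山田", "やまだ", "山田はなこ", "東京都千代田区"):
        print(s, plausible_person_name(s))
```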

Also, character strings such as "guest", "ended" and
"manager", which are inappropriate for the program search, are
never extracted as keywords, because they have been replaced
with spaces.

In this manner, in step S15, the name of a famous person who
has Chinese characters for the family name and Hiragana or
Katakana for the first name, the name of a person who became
famous only recently, a foreigner's name whose first name is
written in alphabet and whose family name is written in
Katakana, and a foreigner's name in which "•" (midpoint) is
inserted between the first and family names are also stored in
the flash memory 22 as keywords for the program search.
Consequently, it becomes possible for a user to search for a
program efficiently by using the extracted keywords.

It should be noted that any suitable method can be applied
for a user to use the keywords stored in the flash memory 22 by
the processes of FIGS. 3 and 4 for the program search. For
example, in response to a predetermined operation of the
REMOCON 28, the CPU 19 produces a video signal of a picture
screen for the program search (a picture screen which displays
the keywords in a list and on which a user selects desired
keywords so as to instruct a search) and transmits it to the
display apparatus 3 by way of the video signal processing
circuit 17 and the video output terminal 29.

As mentioned above, the program recording and reproducing
apparatus 2 is constituted such that keywords can be extracted
accurately using a small-sized program and small-sized
dictionaries, by carrying out the keyword extraction from the
title character string information in the EPG information and
the keyword extraction from the detailed character string
information with keyword dictionaries and rules which differ
from each other and are suited to the respective information.

In this manner, it is constituted such that keywords for
searching for a program can be extracted from the title
character string information and detailed character string
information in the EPG information automatically, efficiently
and moreover accurately, even when the throughput of the CPU 19
or the capacity of the memory (ROM 20, flash memory 22 or the
like) is not so large.

It should be noted that in the above examples the present
invention is applied to an apparatus which records and
reproduces a program of a digital television broadcast. However,
the present invention is not restricted to this, and it is
needless to say that the present invention can also be applied
to a program recording and reproducing apparatus which records
and reproduces a program of an analog television broadcast.

FIG. 7 is a block diagram showing a hardware constitution of
a program recording and reproducing apparatus for an analog
television broadcast to which the present invention is applied.
Video and audio signals in an analog transmission signal which
is received by an antenna 31 and inputted to a program recording
and reproducing apparatus 41 are selected by a tuner 42 by
frequency band and encoded by an MPEG encoder 43.

When a television program is viewed, the encoded video and
audio data are decoded by an MPEG decoder 47 and transmitted to
a display apparatus 61 by way of the program recording and
reproducing apparatus 41.

On the other hand, when recording the television program,
the video and audio data encoded by the MPEG encoder 43 are
transmitted to a main storage device 45 through a bus 44 so as
to be recorded in the main storage device 45.

Then, when reproducing, video and audio data read out from
the main storage device 45 are transmitted to the MPEG decoder
47 through the bus 44, decoded in the MPEG decoder 47 and
transmitted to the display apparatus 61 by way of the program
recording and reproducing apparatus 41.

Also, in an EPG obtaining module 46, EPG information is
obtained from the analog transmission signal of the frequency
band selected by the tuner 42. This EPG information is also
transmitted to the main storage device 45 through the bus 44 and
stored in the main storage device 45.

Also, a communication interface 48 for connecting to the
internet 71, a ROM 49, a main storage device 50, an auxiliary
storage device 51 and the MPEG decoder 47 are each connected to
a bus 52.

Also in this program recording and reproducing apparatus 41,
the keyword dictionary for titles, the excluded character string
dictionary for titles, the keyword dictionary for detailed
information and the excluded character string dictionary for
detailed information as mentioned above are stored in the ROM 49
(for the keyword dictionary for detailed information, the newest
one is downloaded from an exclusive site via the internet and
stored also in the flash memory 51). At the same time, a CPU 53
which controls the whole program recording and reproducing
apparatus 41 performs the same automatic keyword extraction
process as that in FIG. 3 and FIG. 4 by using these dictionaries
and the EPG information in the main storage device 45, and the
extracted keywords are stored in the auxiliary storage device 51.

This program recording and reproducing apparatus 41 is also
constituted, quite similarly to the program recording and
reproducing apparatus 2 explained with reference to FIGS. 1 and
2, such that keywords can be extracted accurately using a
small-sized program and small-sized dictionaries, by carrying
out the keyword extraction from the title character string
information in the EPG information and the keyword extraction
from the detailed character string information with keyword
dictionaries and rules which differ from each other and are
suited to the respective information.

In this manner, it is constituted such that keywords for
searching for a program can be extracted from the title
character string information and detailed character string
information in the EPG information automatically, efficiently
and moreover accurately, even when the throughput of the CPU 53
or the capacity of the memory (ROM 49, flash memory 51 or the
like) is not so large.

Also, in the above examples, the present invention is
applied to a program recording and reproducing apparatus which
has a body separate from a display apparatus. However, the
present invention is not restricted to this and can also be
applied to a television receiver in which a program recording
and reproducing apparatus and a display apparatus are formed as
one body, or to a television receiver which does not have a
function of recording and reproducing programs.

Also, in the above examples, the present invention is
applied to extracting keywords from the title character string
information and detailed character string information of a
program in the EPG information. However, the present invention
is not restricted to this and can be applied to extracting
keywords from title character string information and detailed
character string information of contents other than television
programs (for example, contents delivered via the internet).

Also, it is needless to say that the present invention is
not restricted to the above examples and that various other
constitutions can be taken without departing from the scope of
the present invention.

As mentioned above, according to the present invention, an
effect is obtained such that keywords for searching for a
program can be extracted automatically, efficiently and moreover
accurately from the title character string information and
detailed character string information of the program, such as
EPG information, even in home electric appliances in which the
throughput of the CPU or the memory capacity is not so large.